An analysis of data distribution methods for Gaussian elimination in distributed-memory multicomputers
Abstract
In multicomputers, an appropriate data distribution is crucial for reducing communication overhead and therefore improving overall performance. For this reason, data-parallel languages provide programmers with primitives, such as BLOCK and CYCLIC, that can be used to distribute data across the distributed memory. However, the languages do not guide the programmer on how the distribution should be performed to maximize performance. Therefore, this paper presents an analysis of data distribution methods for overlapping computation and communication in the Gaussian elimination algorithm. The analysis indicates that both BLOCK and CYCLIC distributions have their own merits; however, BLOCK_CYCLIC, with its hybrid characteristic, consistently outperforms its counterparts.
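As a rough illustration of the three distributions named in the abstract, the sketch below maps matrix rows to processors and counts how many rows each processor still has to update at a given elimination step. It assumes a row-wise distribution; the function names and the parameters `n`, `p`, `b`, and `k` are hypothetical choices made for this example and are not taken from the paper. It shows only the load-balance side of the trade-off, not the communication cost that the paper analyzes.

```python
from collections import Counter


def block_owner(i, n, p):
    """BLOCK: n rows are split into p contiguous chunks (assumed row-wise)."""
    chunk = -(-n // p)  # ceil(n / p)
    return i // chunk


def cyclic_owner(i, p):
    """CYCLIC: rows are dealt out to processors one at a time, round-robin."""
    return i % p


def block_cyclic_owner(i, b, p):
    """BLOCK_CYCLIC: blocks of b consecutive rows are dealt out round-robin."""
    return (i // b) % p


def active_rows(owner, n, p, k):
    """Count, per processor, the rows k+1..n-1 still updated at step k."""
    counts = Counter(owner(i) for i in range(k + 1, n))
    return [counts.get(q, 0) for q in range(p)]


# Hypothetical sizes: 16 rows, 4 processors, block size 2, elimination step 7.
n, p, b, k = 16, 4, 2, 7

block_load = active_rows(lambda i: block_owner(i, n, p), n, p, k)
cyclic_load = active_rows(lambda i: cyclic_owner(i, p), n, p, k)
block_cyclic_load = active_rows(lambda i: block_cyclic_owner(i, b, p), n, p, k)

print("BLOCK       ", block_load)         # [0, 0, 4, 4]
print("CYCLIC      ", cyclic_load)        # [2, 2, 2, 2]
print("BLOCK_CYCLIC", block_cyclic_load)  # [2, 2, 2, 2]
```

With these sizes, BLOCK leaves two processors idle halfway through the elimination, while CYCLIC and BLOCK_CYCLIC keep the remaining work evenly spread. BLOCK, on the other hand, keeps neighbouring rows on the same processor, which is one reason each distribution can have its own merits.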
Related resources
Techniques for Compiling Programs on Distributed Memory Multicomputers
It is widely accepted that distributed memory parallel computers will play an important role in solving computation-intensive problems. However, the design of an algorithm in a distributed memory system is time-consuming and error-prone, because a programmer is forced to manage both parallelism and communication. In this paper, we present techniques for compiling programs on distributed memory ...
Toward Automatic Distribution
This paper considers the problem of distributing data and code among the processors of a distributed memory supercomputer. Provided that the source program is amenable to detailed dataflow analysis, one may determine a placement function by an algorithm analogous to Gaussian elimination. Such a function completely characterizes the distribution by giving the identity of the virtual processor on ...
Compiler Techniques for Determining Data Distribution and Generating Communication Sets on Distributed-Memory Multicomputers
This paper is concerned with designing efficient algorithms for determining data distribution and generating communication sets on distributed memory multicomputers. First, we propose a dynamic programming algorithm to automatically determine data distribution at compile time. This approach is different from previous research works, which only allow programmers explicitly to specify the data distrib...
A Rotate-Tiling Image Composition Method for Parallel Volume Rendering on Distributed Memory Multicomputers
The binary-swap and the parallel-pipelined methods are two popular image composition methods for volume rendering on distributed memory multicomputers. However, these methods either restrict the number of processors to a power of two or require many steps to transform image data that results in high communication overheads. In this paper, we present an efficient image composition method, the ro...
Enhancement of parallelism for tearing-based circuit simulation
A new circuit simulation system is presented with the techniques "Subcircuit Balancing with Estimated Update operation count" (SBEU) and "Asynchronous Distributed Row-based interconnection parallelization" (A-DR). SBEU estimates the Gaussian elimination cost of each subcircuit by counting the number of update operations to achieve balanced circuit partitioning. A-DR makes it possible to overlap numerical o...